1 Introduction

Bar plots are one of the most common chart types and come in several varieties. In the previous lesson, we learned how to make bar plots and their circular counterparts with {ggplot2}.

This lesson will pivot from group comparisons to the practice of labeling in data visualization. Labels provide additional context, clarify data points, and enhance the overall readability of a plot. We’ll delve into the intricacies of labeling in {ggplot2}, focusing on the geom_label() and geom_text() functions.

2 Learning Objectives

After this lesson, you will be able to:

  1. Use two different text geoms to label ggplots:
    • geom_text() for simple labels
    • geom_label() for emphasized labels
  2. Transform and summarize data into the appropriate format for different chart types.
  3. Adjust text placement to position labels on stacked, grouped, and percent-stacked bar plots.
  4. Adjust text placement to position labels on pie charts and donut plots.

3 Packages

We’ll utilize a combination of packages in this lesson to enhance our data visualizations:

  1. tidyverse: A collection of R packages for efficient data manipulation and visualization, including ggplot2.

  2. glue: Enables flexible string interpolation for dynamic text in plots.

  3. here: For project-relative file paths.

pacman::p_load(tidyverse, glue, here)

4 Introduction to text geoms in {ggplot2}

In {ggplot2}, adding labels is an exercise in precision and aesthetics. We’ll start with geom_text() for simple labeling and then move to geom_label() for labels with more emphasis. In this section we will introduce the difference between each of these functions with a simple bar plot, then we will get into more details on how to leverage them for stacked bars, grouped bars, normalized stacked bars, and circular plots.

First let’s practice using these functions on a simple bar plot made with fake data. Once we cover the fundamentals of the labeling syntax, we will apply these to real epidemiology data.

# Create example data frame
data <- data.frame(
  category = c("A", "B", "C"),
  count = c(10, 20, 15)
)

# Create the bar plot
ggplot(data, aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  theme_light()

We can easily add labels to our bars. We achieve this using the geom_text() function and telling the aes() which column to extract label text from:

# Add text labels on bars
ggplot(data, aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  theme_light() +
  # JUST ADD ONE GEOM LAYER!
  geom_text(aes(label = count)) # must provide variable to `label` argument

As you can see, it is pretty easy to improve your ggplot with a few lines of code. The rest of this lesson shows you several ways to do so.

Next we’ll build on the simple code above to learn more about geom_text() and geom_label(). These two labeling geoms offer distinct approaches for adding text to plots, each with its own characteristics and use cases.

  • geom_text(): This function places plain text directly onto the plot. It is best used when the background is not too busy, and the text does not need to stand out excessively.

# Basic labels with geom_text()
ggplot(data, aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  theme_light() +
  geom_text(aes(label = count)) 

While labels are indeed useful, the placement of our text is odd: neither on the bar, nor under the bar. Additionally, they are quite small and difficult to make out. We can address this by making them bigger, and vertically adjusting their placement.

The geom_text() function comes with two arguments that help you align and position text labels: hjust and vjust, the horizontal and vertical justification of the text.

We can build on the code from the last example to add additional aesthetics to the function:

# Improved labels with geom_text()
ggplot(data, aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  # CUSTOMIZE AESTHETICS
  geom_text(aes(label = count), 
            vjust = -0.3, # Place the labels above the bars
            size = 5) + # Increase text size
  theme_light() +
  labs(title = "Bar Plot with Labels from geom_text()",
       x = "Group", y = "Total")

In this code, the vjust parameter is adjusted to position the text just above the bars, and the size can be adjusted to control text size.
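The horizontal counterpart, hjust, works the same way. The sketch below is a minimal illustration rather than part of the original lesson: it reuses the fake data, flips the bars with coord_flip(), and uses a negative hjust to push each label just past the end of its bar. The expand_limits() call and the specific values are assumptions chosen to leave room for the labels.

```r
library(ggplot2)

# Same fake data as in the examples above
data <- data.frame(
  category = c("A", "B", "C"),
  count = c(10, 20, 15)
)

p_flipped <- ggplot(data, aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = count),
            hjust = -0.3, # Push labels just past the end of each bar
            size = 5) +
  # Assumed value: extend the count axis so the longest bar's label fits
  expand_limits(y = 22) +
  # Draw the bars horizontally
  coord_flip()

p_flipped
```

Because the text is justified in the orientation in which it is drawn, hjust is the argument that moves labels along horizontal bars.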

Or, we can use the geom_label() function to draw a rectangle behind the text, enhancing contrast and legibility. This is especially useful for plots with complex backgrounds.

# Labels with geom_label()
ggplot(data, aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  # GEOM_LABEL()
  geom_label(aes(label = count),
             vjust = 1.3, # Move text down
             fill = "yellow") + # Color background rectangle
  theme_light() +
  labs(title = "Bar Plot with Labels from geom_label()",
       x = "Group", y = "Total")

In this code, the fill aesthetic in geom_label() can be adjusted to control background fill color of the labels, and this time the vjust parameter is adjusted to lower the labels.

Two distinct text geoms

Text geoms are useful for labeling plots. They can be used in combination with other geoms, such as geom_col(), to annotate the height of bars.

  • geom_text() adds only text to the plot

  • geom_label() draws a rectangle behind the text, making it easier to read

Setting a global {ggplot2} theme

So far, we’ve added a theme function to each of our bar plots. We can use the theme_set() function to set a global theme for the rest of our plots, so that we don’t have to add it each time.

# Set a theme_light() for all ggplots in this lesson
theme_set(theme_light())

Now theme_light() will be automatically applied to every plot you draw.
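If you later want to return to the previous default, note that theme_set() returns the theme that was active before the change. The following is a small sketch of that pattern; the object names and the use of the built-in mtcars dataset are only illustrative.

```r
library(ggplot2)

# theme_set() returns the previously active theme, so we can keep it
old_theme <- theme_set(theme_light())

# Any plot created now uses theme_light() without adding it explicitly
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Restore the earlier theme once the global setting is no longer wanted
theme_set(old_theme)
```

In this lesson we keep theme_light() active, so the restoring step is only needed if you want to undo the global setting.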

5 Data: TB treatment outcomes in Benin

The above examples were straightforward enough, but real data is often more complex, especially when you have multiple subgroups and levels of aggregation. You’ll run into problems if you don’t correctly prepare the data. Let’s dive in.

The tb_outcomes dataset will serve as the foundation for our examples, providing us a rich set of data points to label.

tb_outcomes <- read_csv(here::here('data/benin_tb.csv'))
tb_outcomes

We will regenerate plots from the previous lesson to serve as the foundation for this lesson.

6 Labeling simple bar plots

# Normal bar plot - splendid! no precalculation required
tb_outcomes %>%  
  ggplot(aes(x = hospital, y = cases)) +
  geom_col(fill = "steelblue")

Great! Let’s add geom_text() like we did earlier.

# but if we try to add geom_text we have a problem...
tb_outcomes %>%  
  ggplot(aes(x = hospital, y = cases)) +
  geom_col(fill = "steelblue") + 
  geom_text(aes(label = cases))

Oh no! Rather than one number at the top of each bar, we’ve ended up with dozens of labels crowded at the bottom of each bar. Let’s investigate the issue.

Let’s take a look at our tb_outcomes data frame. geom_text() is taking values directly from the cases column and putting them on the y-axis.

tb_outcomes

The data frame has 711 rows, which means that geom_text() has added 711 labels to the plot, one for each row, rather than a single label per bar.

Unlike our fictitious dataset, where we had precalculated totals for each bar, raw data is often not aggregated. When we use geom_col(), the values from all rows in a category are stacked, so the height of the bar shows the total. But geom_text() and geom_label() draw one label per row and don’t calculate totals for the labels.
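The mismatch between rows and bars can be checked directly. The sketch below uses a small hypothetical data frame (not the Benin data) and ggplot2's layer_data() function, which returns the data computed for a single layer of a plot.

```r
library(ggplot2)

# Hypothetical raw data: several rows per hospital, as in non-aggregated data
raw <- data.frame(
  hospital = c("H1", "H1", "H1", "H2", "H2"),
  cases = c(3, 5, 2, 4, 6)
)

p <- ggplot(raw, aes(x = hospital, y = cases)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = cases))

# Layer 2 is geom_text(): it holds one label for every input row
n_labels <- nrow(layer_data(p, 2))

# The number of bars is the number of distinct hospitals
n_bars <- length(unique(raw$hospital))

n_labels
n_bars
```

Whenever the number of labels is larger than the number of bars, the data needs to be summarized before labeling.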

To solve this problem, we will create a temporary dataset using {dplyr} functions, allowing us to summarize the total number of cases per hospital.

6.1 Step 1: Summarize the data

First, we’ll begin by aggregating our data and then crafting a basic bar plot as our starting point:

  • Start by summarizing the data: This code chunk first groups our tb_outcomes dataset by hospital, and then calculates the sum of cases (cases) for each hospital.

hospital_sums <- tb_outcomes %>% 
  # Group the data by health facility ('hospital') 
  group_by(hospital) %>% 
  # Get the total number of cases per hospital 
  summarise(cases_sum = sum(cases))

hospital_sums
## # A tibble: 6 × 2
##   hospital         cases_sum
##   <chr>                <dbl>
## 1 CHPP Akron             875
## 2 CS Abomey-Calavi       791
## 3 Hopital Bethesda       256
## 4 Hopital Savalou         80
## 5 Hopital St Luc         168
## 6 St Jean De Dieu        171

Now we have 6 values for cases_sum, which will serve as our 6 text labels, one for each bar.
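As an aside, {dplyr} offers a shortcut for this kind of weighted tally. The sketch below, which uses hypothetical data rather than tb_outcomes, shows that count() with the wt argument gives the same table as group_by() followed by summarise().

```r
library(dplyr)

# Hypothetical raw data with several rows per hospital
raw <- tibble(
  hospital = c("H1", "H1", "H1", "H2", "H2"),
  cases = c(3, 5, 2, 4, 6)
)

# Approach used in the lesson: group, then sum
sums_a <- raw %>%
  group_by(hospital) %>%
  summarise(cases_sum = sum(cases))

# Shortcut: count() sums the `wt` column within each hospital
sums_b <- raw %>%
  count(hospital, wt = cases, name = "cases_sum")

# Both approaches produce the same summary table
all.equal(as.data.frame(sums_a), as.data.frame(sums_b))
```

Either version yields one row per bar, which is what the labeling geoms need.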

6.2 Step 2: Create the base plot

Now let’s use hospital_sums to visualize each hospital’s total number of cases:

hosp_bar <- hospital_sums %>% 
  # Pass the summarized data to ggplot for visualization
  ggplot(
    # Within aes(), specify that the x-axis will represent the 'hospital', 
    # the height of the bars (y-axis) will represent the 'cases_sum', 
    aes(x = hospital, y = cases_sum)) +
  # Use geom_col() to create a blue bar plot 
  geom_col(fill = "steelblue") +
  labs(title = "New and relapse TB cases per hospital",
       subtitle = "Data from six health facilities in Benin, 2015-2017")

hosp_bar

This plot looks identical to the one we created with tb_outcomes, but the key difference is that the height of each bar in hospital_sums is precalculated, rather than computed within geom_col().

6.3 Step 3: Annotate plot with geom_text() or geom_label()

Now, let’s use geom_text() to label each bar with the total number of TB cases for that hospital.

# Give precalculated labels to geom_text()
hosp_bar + 
  geom_text(aes(label = cases_sum))

That’s much better!

For labels that need to stand out more, we can use geom_label() which draws a rectangle behind the text.

# Give precalculated labels to geom_label()
hosp_bar + 
  geom_label(aes(label = cases_sum))

6.4 Step 4: Adjust text placement and style

An important argument to use with geom_text() and other labeling functions in {ggplot2} is vjust. This argument adjusts the vertical position of the text - i.e., it moves it up or down.

You can set vjust to a negative number to move it upwards:

# Position text above the bars 
hosp_bar + 
  geom_text(aes(label = cases_sum), 
            # ADJUST VERTICAL HEIGHT UPWARDS
            vjust=-0.2) 

…or to a positive number to move the text downwards:

# Position text inside the bars 
hosp_bar + 
  geom_text(aes(label = cases_sum), 
            vjust=1.5) # ADJUST VERTICAL HEIGHT DOWNWARDS

We can use additional arguments in geom_text() to modify the color and size of the text.

# Add adjustments (fixed aesthetics) to geom_text() 
hosp_bar + 
  geom_text(aes(label = cases_sum), 
            vjust=1.5, 
            color="white", # Change text colour to white
            size=5) # Increase font size 

Aesthetic modifications

So far we have only used some of the possible aesthetics for geom_text() and geom_label(). The three required aesthetics are x, y, and label. These must be mapped to a variable inside aes(). In addition to the required aesthetics, these geoms accept a number of optional aesthetics to customize the text. For example:

# Additional adjustments (fixed aesthetics) in geom_text()
hosp_bar + 
  geom_text(aes(label = cases_sum), 
            #vjust = 1.5,
            color = "white",
            size = 5,
            family = "serif",
            fontface = "bold",
            angle = 90,
            # Use hjust instead of vjust because text is rotated 90 degrees
            hjust = 1.2)

Learn more about setting these aesthetics in vignette("ggplot2-specs"). Run this code in your console and scroll down to the “Text” section of the vignette.
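One detail covered in that vignette is the unit of size: geom_text() measures text size in millimetres, whereas theme elements use points. The sketch below assumes we want labels of roughly 14 pt (an arbitrary target) and converts with the .pt constant exported by {ggplot2}.

```r
library(ggplot2)

# Same fake data as in the introduction
data <- data.frame(
  category = c("A", "B", "C"),
  count = c(10, 20, 15)
)

# Assumption: we want labels of about 14 pt.
# .pt is the number of points per millimetre (72.27 / 25.4).
label_size_mm <- 14 / .pt

ggplot(data, aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = count),
            vjust = 1.5,
            color = "white",
            size = label_size_mm) # size is given in mm, not pt
```

This conversion is handy when labels should match the font size used elsewhere in the plot.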

While this bar plot with just one categorical variable looks good, most of the time we’re dealing with multiple categorical variables that need plotting. We might need to stack them, align them side by side, or even transform them into circular plots. This is where things get more complex.

Let’s build plots with two categorical variables, each with multiple levels, and add labels to each subgroup. We’ll start with stacked bar plots.

7 Labeling stacked bar plots

First, we’ll create a base stacked bar plot, with no labels.

# Simple stacked bar plot
tb_outcomes %>% 
  ggplot(aes(x = period_date, y = cases)) +
  # Map fill to a categorical variable in geom_col() to stack the bars
  geom_col(aes(fill = diagnosis_type)) + 
  labs(title = "New and relapse TB cases per quarter",
       subtitle = "Data from six health facilities in Benin, 2015-2017")

The code is quite similar to our single-category bar plot, but instead of filling the bars in a specific color ("steelblue") as we did before, we’re now filling it by a second categorical variable.

As we’ve already seen, simply adding geom_text() to this code isn’t going to have the desired outcome.

# Stacked bar plot with geom_text() - faulty code
tb_outcomes %>% 
  ggplot(aes(x = period_date, y = cases)) +
  # Map fill to a categorical variable in geom_col() to stack the bars
  geom_col(aes(fill = diagnosis_type)) + 
  labs(title = "New and relapse TB cases per quarter",
       subtitle = "Data from six health facilities in Benin, 2015-2017") +
  geom_text(aes(label = cases))

Remember - we need to summarize the data to precalculate the text labels. The summary data frame should have the same number of rows as the number of labels we need. In this case, we will need a label for each bar segment - a total of 24 labels.

7.1 Step 1: Summarize the data

This code chunk first summarizes our tb_outcomes dataset by period_date and diagnosis_type, calculating the sum of cases (cases) for each group.

# Summarize the data by period and diagnosis type
tb_sum <- tb_outcomes %>% 
  group_by(period_date, diagnosis_type) %>% 
  summarise(cases = sum(cases))

tb_sum

7.2 Step 2: Create the base plot

  • Now, let’s create a simple stacked bar plot, where each bar’s height represents the total cases for a particular diagnosis in each period.

# Create a basic bar plot using the summarized data
quarter_dx_bar <- tb_sum %>% 
  ggplot(aes(x = period_date, y = cases, fill = diagnosis_type)) +
  geom_col() + 
  labs(title = "New and relapse TB cases per quarter",
       subtitle = "Data from six health facilities in Benin, 2015-2017")

quarter_dx_bar

7.3 Step 3: Annotate plot with geom_text() or geom_label()

  • Next, let’s add labels to this plot. We’ll use the cases column for labeling each bar.

# Add text labels to the bar plot
quarter_dx_bar +
  geom_text(aes(label = cases))

Oops, the labels are not in the right place! They don’t align with the height of the bars in our plot.

While this may not be immediately apparent in the code, geom_col() uses the "stack" position by default, which is why the bar segments are stacked when fill is mapped to a variable in aes(). geom_text() does not stack by default, so its labels are placed at the raw y values. Therefore, we have to explicitly specify the "stack" position in the geom_text() function to ensure the labels are placed correctly on the bars.

7.4 Step 4: Adjust text position to align with the bars

If we add a position argument to geom_text() and ask it to stack the labels, we will get labels at the top of each bar segment.

# Place text at the top of each bar segment
quarter_dx_bar +
  geom_text(
    aes(label = cases),
    position = position_stack()) # Set position to stack

Great!

Now, we want to move the text inside the bars by adding vjust to geom_text().

# Reposition labels within the stacks for clarity and change font styling
quarter_dx_bar +
  geom_text(aes(label = cases),
    position = position_stack(),
    vjust = 1.5,
    color = "white",
    fontface = "bold")

That worked well, but rather than moving the labels up or down by a fixed amount, we can also specify the label height relative to the corresponding bar segment. To do this, we put the vjust parameter inside position_stack() and set it to 0.5, which tells ggplot to center the labels vertically within each bar segment.

# To place text in the middle of each bar segment in a stacked barplot, you need to set the vjust parameter of position_stack()
quarter_dx_bar +
  geom_text(
    aes(label = cases),
    # ADD APPROPRIATE POSITION FUNCTION
    position = position_stack(vjust = 0.5),
    color = "white",
    fontface = "bold")
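To see what the position adjustment does to the label coordinates, we can inspect the computed layer data. This is a sketch on hypothetical summarized data (two periods, two diagnosis types), comparing the default placement, position_stack(), and position_stack(vjust = 0.5) against the edges of the bar segments returned by layer_data().

```r
library(ggplot2)

# Hypothetical summarized data: one row per bar segment
toy <- data.frame(
  period = rep(c("Q1", "Q2"), each = 2),
  dx = rep(c("bacteriological", "clinical"), times = 2),
  cases = c(30, 10, 40, 20)
)

base <- ggplot(toy, aes(x = period, y = cases, fill = dx)) +
  geom_col()

# Three variants of the label layer
p_default <- base + geom_text(aes(label = cases))
p_top <- base + geom_text(aes(label = cases),
                          position = position_stack())
p_middle <- base + geom_text(aes(label = cases),
                             position = position_stack(vjust = 0.5))

# Lower and upper edge (ymin, ymax) of every bar segment
bars <- layer_data(base, 1)

# Default: labels sit at the raw values and ignore the stacking
y_default <- sort(layer_data(p_default, 2)$y)

# position_stack(): labels sit at the top edge of each segment
y_top <- sort(layer_data(p_top, 2)$y)

# vjust = 0.5: labels sit halfway between the bottom and top edges
y_middle <- sort(layer_data(p_middle, 2)$y)

all.equal(y_top, sort(bars$ymax))
all.equal(y_middle, sort((bars$ymin + bars$ymax) / 2))
```

The values are sorted before comparing so that the check does not depend on the row order within each layer.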

This label placement is especially nice for horizontal bar plots.

# Convert to horizontal plot with coord_flip()
quarter_dx_bar +
  geom_text(
    aes(label = cases),
    position = position_stack(vjust = 0.5),
    color = "white",
    fontface = "bold") + 
  # Swap orientation of the bars
  coord_flip()

That looks great! Let’s move on to grouped bar charts now.

8 Labeling grouped bar charts

Grouped bar charts display multiple categories side by side. Let’s explore how to group the data and properly position labels for clear interpretation.

8.1 Step 1: Summarize the data

To begin, we’ll group our dataset tb_outcomes by hospital and diagnosis_type, calculating the sum of cases (cases) for each group.

hospital_dx_cases <- tb_outcomes %>% 
  group_by(hospital, diagnosis_type) %>% 
  summarise(cases = sum(cases))

hospital_dx_cases

8.2 Step 2: Create the base plot

Next, let’s create a simple grouped bar chart, where the height of each bar signifies the total number of cases for a specific diagnosis in each hospital. The default position for geom_col() is "stack". To create a grouped bar chart, we’ll need to specify position = position_dodge().

# Use "dodge" instead of the default "stack"
hospital_dx_bar <- hospital_dx_cases %>% 
  ggplot(aes(x = hospital, y = cases, fill = diagnosis_type)) +
  geom_col(position = position_dodge())

hospital_dx_bar

8.3 Step 3: Annotate plot with geom_text() or geom_label()

Now, we can annotate the chart with geom_text() to display the labels, just as we’ve done before.

# Add geom_text()
hospital_dx_bar +
  geom_text(aes(label = cases))

Oops, that’s not quite right! The labels for each hospital are lined up vertically at the center of the group, rather than sitting over their own bars. Let’s take a look at how we can fix that.

8.4 Step 4: Adjust text position to align with the bars

Just as with our stacked bar chart in the previous section, we need to add a position adjustment to geom_text(), but this time we’re going to specify the dodge position.

# Specify dodge position for `geom_text()`
hospital_dx_bar +
  geom_text(aes(label = cases),
            position = position_dodge())

We get the same chart as before! When using geom_text() with the dodge position in {ggplot2}, we need to specify the width parameter because text labels have no width of their own, so ggplot2 doesn’t know how far to shift them.

For geom_col(), the default value of width is 0.9. Since we didn’t specify a different value when creating our chart, geom_col() used the default, so we’ll also use 0.9 for geom_text() to ensure the bars and labels are aligned.

# Instruct ggplot2 to align the text by adding the dodge width
hospital_dx_bar +
  geom_text(aes(label = cases),
            position = position_dodge(width = 0.9))
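The same rule applies when the bars do not use the default width. The sketch below uses hypothetical data and an assumed bar width of 0.6; it passes the same value to geom_col(), to its dodge, and to the dodge of geom_text(), then compares the horizontal positions of the two layers with layer_data().

```r
library(ggplot2)

# Hypothetical summarized data: two hospitals, two diagnosis types
toy <- data.frame(
  hospital = rep(c("H1", "H2"), each = 2),
  dx = rep(c("bacteriological", "clinical"), times = 2),
  cases = c(30, 10, 40, 20)
)

bar_width <- 0.6 # Assumed non-default bar width

p_dodge <- ggplot(toy, aes(x = hospital, y = cases, fill = dx)) +
  geom_col(width = bar_width,
           position = position_dodge(width = bar_width)) +
  geom_text(aes(label = cases),
            # Use the same dodge width as the bars
            position = position_dodge(width = bar_width))

# Horizontal centers of the bars and of the labels after dodging
bar_x <- sort(as.numeric(layer_data(p_dodge, 1)$x))
text_x <- sort(as.numeric(layer_data(p_dodge, 2)$x))

all.equal(bar_x, text_x)
```

If the two widths differ, the labels drift away from the centers of their bars.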

Great! Now all that’s left to do is shift the labels up a bit. One way to do this is to adjust the vertical position y directly by adding a small amount to it.

# Adjust labels vertically
hospital_dx_bar +
  geom_text(aes(label = cases, 
                y = cases + 20), # place each label 20 units above its bar
            position = position_dodge(width = 0.9))

That looks great! Let’s move on to percent-stacked bar plots.
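A fixed offset such as 20 depends on the scale of the data. As a possible alternative, the sketch below (on hypothetical data) lifts the labels with a negative vjust, which is relative to the text height, and adds headroom above the tallest bar with expansion() so the top label is not cut off. The 10% padding is an assumed value.

```r
library(ggplot2)

# Hypothetical summarized data: two hospitals, two diagnosis types
toy <- data.frame(
  hospital = rep(c("H1", "H2"), each = 2),
  dx = rep(c("bacteriological", "clinical"), times = 2),
  cases = c(30, 10, 40, 20)
)

p_headroom <- ggplot(toy, aes(x = hospital, y = cases, fill = dx)) +
  geom_col(position = position_dodge()) +
  geom_text(aes(label = cases),
            position = position_dodge(width = 0.9),
            vjust = -0.5) + # Lift labels by half their own height
  # Assumed padding: none below zero, 10% above the tallest bar
  scale_y_continuous(expand = expansion(mult = c(0, 0.1)))

p_headroom
```

With this approach the label's y value stays equal to the bar height, and only its justification changes.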

9 Labeling percent-stacked bar plots

Labeling percent-stacked bar plots is a special case because the y-axis ranges from 0 to 1 to represent proportions or percentages rather than raw values. Each segment within the bar represents a percentage of the total, and the sum of all segments totals 100%.

When labeling percent-stacked bar plots, the labels should reflect the percentages of each category. This means we need to format the labels into percentages to ensure they match the segments on the chart. By the end of this section, you’ll have recreated the graph below!

Figure: Percent-stacked bar plot

9.1 Step 1: Summarize the data

We want to visualize the proportion of cases in each hospital belonging to each diagnostic type. So let’s calculate the total number of cases for each health facility (hospital) by diagnostic type.

# Data preparation for labeling 100% stacked bar charts
hosp_dx_sum <- tb_outcomes %>%
  ## group by hospital and diagnosis type
  group_by(hospital, diagnosis_type) %>%
  ### summarize our data by summing the cases
  summarise(total_cases = sum(cases))

hosp_dx_sum

We could use this dataset to create a percent-stacked bar plot. You may remember from the last lesson that for percent stacked plots, we set the position to fill in geom_col() to normalize the y axis.

# Create a percent-stacked bar plot with the summarized data
hosp_dx_sum %>%
  ggplot(aes(x = hospital, y = total_cases, fill = diagnosis_type)) +
  ## Set position = position_fill() to normalize the y-axis
  geom_col(position = position_fill()) +
  geom_text(aes(label = total_cases),
  ## Set position = position_fill() to match the position of the bars
            position = position_fill(),
            vjust = 1.5,
            color = "white", fontface = "bold", size = 4.5) 

So this is a good start but we want percentages, not raw values.

In order to prepare our data, we’ll start by grouping our dataset tb_outcomes by hospital and diagnosis type. Then we’ll calculate the sum of cases for each combination and compute the proportion of bacteriologically confirmed and clinically diagnosed cases.

# Data preparation for labeling 100% stacked bar charts with PERCENTAGES
hosp_dx_prop <- tb_outcomes %>%
  ## group by hospital and diagnosis type 
  group_by(hospital, diagnosis_type) %>%
  ## first, we summarize our data by summing the cases, as usual
  summarise(total_cases = sum(cases)) %>% 
  ## calculate the proportions:
  ### add a new column with the proportions
  mutate(prop = total_cases / sum(total_cases))

hosp_dx_prop
## # A tibble: 12 × 4
## # Groups:   hospital [6]
##    hospital         diagnosis_type  total_cases  prop
##    <chr>            <chr>                 <dbl> <dbl>
##  1 CHPP Akron       bacteriological         695 0.794
##  2 CHPP Akron       clinical                180 0.206
##  3 CS Abomey-Calavi bacteriological         671 0.848
##  4 CS Abomey-Calavi clinical                120 0.152
##  5 Hopital Bethesda bacteriological         139 0.543
##  6 Hopital Bethesda clinical                117 0.457
##  7 Hopital Savalou  bacteriological          70 0.875
##  8 Hopital Savalou  clinical                 10 0.125
##  9 Hopital St Luc   bacteriological         149 0.887
## 10 Hopital St Luc   clinical                 19 0.113
## 11 St Jean De Dieu  bacteriological         100 0.585
## 12 St Jean De Dieu  clinical                 71 0.415
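The proportions above add up to 1 within each hospital because summarise() drops only the last grouping variable (diagnosis_type), so the data is still grouped by hospital when mutate() runs. The sketch below makes this explicit on hypothetical data by using the .groups argument of summarise() and then checking the sums.

```r
library(dplyr)

# Hypothetical raw data: several rows per hospital and diagnosis type
raw <- tibble(
  hospital = c("H1", "H1", "H1", "H2", "H2", "H2"),
  diagnosis_type = c("bacteriological", "bacteriological", "clinical",
                     "bacteriological", "clinical", "clinical"),
  cases = c(20, 10, 10, 5, 10, 5)
)

props <- raw %>%
  group_by(hospital, diagnosis_type) %>%
  # Keep the data grouped by hospital after summarizing
  summarise(total_cases = sum(cases), .groups = "drop_last") %>%
  # sum(total_cases) is therefore computed within each hospital
  mutate(prop = total_cases / sum(total_cases)) %>%
  ungroup()

props

# Check: the proportions should add up to 1 for every hospital
prop_check <- props %>%
  group_by(hospital) %>%
  summarise(prop_total = sum(prop))

prop_check
```

Stating .groups = "drop_last" does not change the result here; it only documents the behavior that the calculation relies on.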

9.2 Step 2: Create the base plot

Next, let’s create a bar chart using our new dataset hosp_dx_prop with prop as our new y variable:

# Create a normalized stacked bar chart with the summarized data
hosp_dx_fill <- hosp_dx_prop %>%
  ggplot(aes(x = hospital, y = prop, fill = diagnosis_type)) +
  geom_col(position = position_fill()) +
  labs(title = "Diagnosis of New and Relapse Tuberculosis Cases",
       subtitle = "Data from six health facilities in Benin, 2015-2017",
       x="", y = "Proportion", fill = "Diagnostic Method")

hosp_dx_fill

9.3 Step 3: Annotate plot with geom_text() or geom_label()

Now, we can use geom_text() and specify the position of the labels:

# Add text labels to the percent-stacked bar chart
hosp_dx_fill +
  geom_text(aes(label = prop),
            position=position_fill()) 

It’s a good start, but obviously, we still have some work to do to make it look nicer!

9.4 Step 4: Adjust text position to align with the bars

Before adjusting our labels, let’s handle those decimals. We could reduce the number of decimals like this:

hosp_dx_fill +
  geom_text(aes(label = round(prop,2)), # round label text to 2 decimal places
            position = position_fill()) 

However, the better method is this:

hosp_dx_fill +
  geom_text(aes(label = scales::percent(prop)),
            position = position_fill()) 

The {scales} package is commonly used with {ggplot2} for customizing aesthetics, transforming axis scales, formatting labels, defining color palettes, and more.

The scales::percent(prop) function we used in the code above with geom_text() converts the proportions (values from our prop variable) into a percentage format and adds percentage signs. We can also control the number of displayed digits using the accuracy argument (see below).
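To get a feel for the accuracy argument outside of a plot, here is a short sketch using two proportions taken from the table above. It also shows label_percent(), a function in {scales} that builds a reusable formatter with the same settings.

```r
library(scales)

# Two proportions from the hosp_dx_prop table (CHPP Akron)
props <- c(0.794, 0.206)

# accuracy = 1 rounds to whole percentages
whole <- percent(props, accuracy = 1)

# accuracy = 0.1 keeps one decimal place
one_decimal <- percent(props, accuracy = 0.1)

# label_percent() returns a function that can be reused
fmt <- label_percent(accuracy = 1)
reused <- fmt(props)

whole
one_decimal
reused
```

The result is a character vector, which is why it can be passed straight to the label aesthetic.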

Next, we can center the labels within each bar segment using vjust in the position_fill() function. We’ll also set the accuracy argument to 1 to remove the decimals.

# Move label text to the middle of each bar segment
hosp_dx_fill + 
  geom_text(aes(label = scales::percent(prop, accuracy = 1)), # remove decimals
            position = position_fill(vjust = 0.5)) # center labels

It looks great, but I think we can do better! Using flipped coordinates in bar charts can greatly improve readability, especially when dealing with long category names or many different categories. To do this, we’ll use coord_flip() together with geom_text(), just as we did for stacked bar charts:

# Flip coordinates for better visualization
hosp_dx_fill +
  geom_text(aes(label = scales::percent(prop, accuracy = 1)),
            position = position_fill(vjust = 0.5)) +
  coord_flip() 

Great, now we can add some additional aesthetic tweaks, and we’ll get the same chart we saw at the beginning of the section!

# Add additional aesthetic tweaks
hosp_dx_fill +
  theme_light() +
  geom_text(aes(label = scales::percent(prop, accuracy = 1)),
            position = position_fill(vjust = 0.5),
            color = "white", # Change text color
            fontface = "bold", # Make it bold
            size = 4.5) + # Change font size
  coord_flip() 

Amazing! Let’s move on to our last section where we’ll take a look at circular plots.

10 Labeling circular plots

10.1 Step 1: Summarize the data

Let’s begin by summarizing the data. We’ll calculate the total number of cases for each hospital by grouping the data based on the hospital variable and then calculating the sum of cases in each group.

# New summary table - pie charts can visualize only a single categorical variable, so only one grouping this time
total_results <- tb_outcomes %>%
  group_by(hospital) %>%
  summarise(
    total_cases = sum(cases)) 

total_results

10.2 Step 2: Create the base plot

Now that we have our new dataset, let’s start by creating a simple bar chart. You may recall from the previous lesson that a pie chart is essentially a round version of a 100% stacked bar chart.

# Simple bar chart (precursor to the pie chart) 
results_stack <- ggplot(total_results,
       aes(x=4, # Set an arbitrary x value  
           y=total_cases,
           fill=hospital)) +
  geom_col()

results_stack

Now, we can create our basic pie chart. As we learned in the last lesson, to transform linear coordinates into polar coordinates, we use the coord_polar() function. The theta parameter defines which aesthetic variable should be mapped to the angular coordinate in the polar coordinate system. By specifying "y", we use the height of the bars to determine the angle of each slice in our pie chart.

outcome_pie <- results_stack +
  coord_polar(theta = "y")

outcome_pie

Great! This will serve as our base pie chart. Next, let’s create a base donut chart using xlim().

outcome_donut <- outcome_pie +
  xlim(c(0.2, 4.5))

outcome_donut

Alright, we’re ready to move on to labelling!

10.3 Step 3: Annotate plot with geom_text() or geom_label()

Now, let’s add labels to our pie chart using geom_text().

# Add geom text as you would for a regular stacked bar chart
outcome_pie +
  geom_text(aes(label = total_cases)) 

You’ll notice that the labels follow the circular layout because coord_polar() is applied to both geom_col() and geom_text(). However, the numbers appear in the wrong segments because we haven’t added a position adjustment to the text layer yet, so each label is placed at its raw value rather than at its stacked position.

10.4 Step 4: Adjust the position of text to align with circle slices and ring sections

Now, just as we did previously, we will use the position_stack() function with vjust to center the labels.

outcome_pie +
  geom_text(aes(label = total_cases), 
            position = position_stack(vjust = 0.5)) # Center the labels

To move the labels along the x-axis of our pie chart (up and down the radius), we can assign a fixed value to the x aesthetic in geom_text().

outcome_pie +
  geom_text(aes(label = total_cases,
                x = 4.25), # move the text away from the center   
            position = position_stack(vjust = 0.5)) 

We can do the same with geom_label().

# Similar adjustment with geom_label()
outcome_pie +
  geom_label(aes(label = total_cases,
                 x = 4.7), # move the text away from the center
            position = position_stack(vjust = 0.5))

Notice that once we used geom_label(), the letter ‘a’ appeared on the legend. It’s annoying and would ruin all the hard work we’ve done to make these charts presentable. To fix this issue, you can add the show.legend = FALSE argument to the geom_label() function like this:

# Similar adjustment with geom_label()
outcome_pie +
  geom_label(aes(label = total_cases,
                 x = 4.7), # move the text away from the center
            position = position_stack(vjust = 0.5),
            show.legend=FALSE) # remove letter "a" from legend

Next, let’s move on to our basic donut chart. We’ll label it using geom_text() and specify the position directly, centering our labels in the middle of each section of the chart, just as we did for our pie chart.

# add text - lims applied to columns and text  
outcome_donut +
  geom_text(aes(label = total_cases), 
            position = position_stack(vjust = 0.5))

To finish, we can make some additional aesthetic adjustments. Here, we enhance the chart’s aesthetics by applying theme_void() to remove cluttered background elements, introducing a new color palette with scale_fill_viridis_d(), and adjusting the text labels using geom_text() with white and bold text for better visibility and contrast.

# Additional aesthetic modifications
outcome_donut +
  geom_text(aes(label = total_cases),
            position = position_stack(vjust = 0.5),
            color = "white",
            fontface = "bold") +
  theme_void() +
  scale_fill_viridis_d()

Congratulations, it looks great!
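The {glue} package we loaded at the start of the lesson can be used to build richer label text. The sketch below is one possible approach on hypothetical totals (not the Benin data): it precomputes a label column that combines the count and its share of the total, then uses that column in a donut chart. The hospital names and the label format are assumptions.

```r
library(dplyr)
library(ggplot2)
library(glue)

# Hypothetical totals for three facilities
totals <- tibble(
  hospital = c("H1", "H2", "H3"),
  total_cases = c(50, 30, 20)
) %>%
  mutate(
    # Share of all cases for each facility
    prop = total_cases / sum(total_cases),
    # Combine the count and the percentage into one label string
    label = as.character(
      glue("{total_cases} ({scales::percent(prop, accuracy = 1)})")
    )
  )

ggplot(totals, aes(x = 4, y = total_cases, fill = hospital)) +
  geom_col() +
  geom_text(aes(label = label),
            position = position_stack(vjust = 0.5), # center the labels
            color = "white",
            fontface = "bold") +
  coord_polar(theta = "y") +
  xlim(c(0.2, 4.5)) + # leave a hole in the middle for the donut
  theme_void()
```

Precomputing the label in the data frame keeps the plotting code short and makes the label text easy to inspect before drawing.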

Wrap Up!

In this lesson, we delved into enhancing plots with labels, focusing on geom_label() and geom_text().

We started with geom_text(), demonstrating how to place readable text directly onto plots using the tb_outcomes dataset. Then we looked at geom_label() to create more prominent labels with background boxes, ideal for complex plot backgrounds.

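We also saw why data often needs to be summarized before labeling, and how to match label positions to stacked, grouped, and percent-stacked bars, as well as to pie and donut charts. Finally, we used flipped coordinates in bar plots for enhanced readability and label visibility.

Together, these techniques help you use labels effectively in {ggplot2} to improve the clarity and visual appeal of your data visualizations.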

Solutions

  1. Understanding geom_text_repel() and geom_label_repel()

    B)

  2. Implementing geom_text()

    A)

  3. Using geom_label() for Emphasis

    C)

  4. Formatting Labels with geom_richtext()

    A)

  5. Flipped Coordinates and Axis Expansion

    C)

This work is licensed under the Creative Commons Attribution Share Alike license.